fix(generic_files): flint_sprintf on 32-bit glibc (closes #2646) by edgarcosta · Pull Request #2648 · flintlib/flint

edgarcosta · 2026-04-27T20:17:52Z

Attempt to fix #2646.

On 32-bit glibc (i386, armhf), vsnprintf(dst, n, fmt, …) with n ≳ 16 MB silently drops everything after the first character. flint_vsprintf was routing through flint_vsnprintf(s, INT_MAX, …), so every flint_sprintf("x%wd", k) produced "x" instead of "x<k>".

fmpz_mpoly_set_str_pretty builds variable names with this exact call, so every variable collapsed to "x". The parser then failed prefix matching, returned -1, and left the polynomial malformed, which is why the three deterministic tests in #2646 fail on 32-bit:

mpoly_test_irreducible: FAIL: check 8 variable example
fmpz_mpoly_compose_fmpz_mpoly: Check non-example 1
nmod_mpoly_compose_nmod_mpoly: Check non-example 1

Fix

Give flint_sprintf / flint_vsprintf their own sink in a new src/generic_files/io_vsprintf.c that calls system vsprintf directly. sprintf semantics already require the caller to provide a sufficiently large buffer, so no length bound is needed and the glibc edge case is avoided. flint_snprintf is unchanged.
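
For readers who want the shape of the idea without opening the diff, here is a minimal sketch of an unbounded sink. It is not the contents of io_vsprintf.c: the names demo_vsprintf and demo_sprintf are hypothetical, and the sketch only handles standard conversions, not FLINT's %wd family. The point is that no size argument is passed at all, so the large-n path in vsnprintf is never entered.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for an unbounded sink: forwards straight to the
   system vsprintf, so no size argument (and no INT_MAX) is ever passed.
   As with sprintf, the caller must supply a large enough buffer. */
static int demo_vsprintf(char *dst, const char *fmt, va_list ap)
{
    assert(dst != NULL && fmt != NULL);
    return vsprintf(dst, fmt, ap);
}

/* Variadic front end with the same shape as a sprintf-style wrapper. */
static int demo_sprintf(char *dst, const char *fmt, ...)
{
    va_list ap;
    int res;

    va_start(ap, fmt);
    res = demo_vsprintf(dst, fmt, ap);
    va_end(ap);
    return res;
}
```

Calling demo_sprintf(buf, "x%d", 1) on a sufficiently large buf writes "x1" and returns 2.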

Potential alternative fix:

A smaller change that also fixes the bug is a 6-line diff in flint_vsnprintf_vprintf itself:

if (avail > ((size_t) 1 << 16))
    res = vsprintf(dst, fmt, ap_copy);
else
    res = vsnprintf(dst, avail, fmt, ap_copy);

Pros: no new file, no duplicated sink boilerplate (~120 fewer lines).

Cons: subtle behavior change for flint_snprintf callers passing n > 64 KB: they no longer get truncation at n-1. In practice this path was already broken on 32-bit glibc (the very bug we're fixing), and anyone using snprintf with n > 64 KB is effectively using it as sprintf. But it does cross the architectural line between bounded and unbounded writes. Also, the 1 << 16 threshold is a hardcoded constant tuned to the observed glibc behavior, which is not great: it's not principled, and a future glibc change could shift the threshold without us noticing.

I went with the separate-sink version because it preserves snprintf's contract exactly and avoids the hardcoded threshold. Happy to switch on request.
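
As a reminder of the contract in question, the sketch below uses plain C99 snprintf rather than flint_snprintf. The helper name bounded_name is hypothetical. It shows the two guarantees a bounded caller relies on: at most n-1 characters are stored, and the return value is the length the untruncated output would have had.

```c
#include <assert.h>
#include <stdio.h>

/* Format "x<k>" into a buffer of n bytes. Returns the length the full
   output would have had; the buffer itself holds at most n-1 characters
   plus the terminating NUL. */
static int bounded_name(char *dst, size_t n, long k)
{
    assert(dst != NULL && n > 0);
    return snprintf(dst, n, "x%ld", k);
}
```

With a 4-byte buffer and k = 12345, the buffer ends up holding "x12" and the call returns 6. That truncation is exactly what the threshold variant would give up for large n.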

Test plan

New flint_sprintf regression test (src/test/t-io.c) covering %wd round-trip, WORD_MIN/WORD_MAX, and mixed %wd %wu %wx. With the fix reverted, it reports the exact failure on i386: flint_sprintf("x%wd", 1) gave "x" expected "x1".
The three tests from #2646 (Test failures on 32-bit ARM build of FLINT 3.5.0) pass on i386 native (-m32).
Full make check passes on i386 (-m32) and amd64.
flint_sprintf, mpoly_test_irreducible, fmpz_mpoly_compose_fmpz_mpoly, nmod_mpoly_compose_nmod_mpoly pass on armhf under qemu-arm-static. (qemu-user did not reproduce the original failure, which only shows up on real 32-bit glibc, but the run confirms there is no regression.)
Real armhf hardware verification: would appreciate a re-run by @d-torrance on Debian armhf to close out #2646.

`flint_vsprintf` previously routed through `flint_vsnprintf` with a size of `INT_MAX`, but on 32-bit glibc `vsnprintf(dst, n, ...)` silently drops output past the first character once `n` exceeds about 16 MB. The result on i386 and armhf was that `flint_sprintf("x%wd", 1)` produced `"x"` instead of `"x1"`.

This broke `fmpz_mpoly_set_str_pretty` (which builds variable names via `flint_sprintf("x%wd", i + 1)`): every variable became the literal string `"x"`, the parser then failed prefix matching, returned `-1`, and left the polynomial malformed. Tests that exercise the parser then saw a zero/garbage polynomial where they expected a deterministic input, which is the symptom reported in flintlib#2646:

mpoly_test_irreducible FAIL: check 8 variable example
fmpz_mpoly_compose_fmpz_mpoly Check non-example 1
nmod_mpoly_compose_nmod_mpoly Check non-example 1

Fix: give `flint_sprintf` its own sink in a new `io_vsprintf.c` that calls the system `vsprintf` directly. `sprintf` semantics already require the caller to provide a sufficiently large buffer, so no length bound is needed and the glibc edge case is avoided.

Also add a regression test that catches the precise failure mode (`flint_sprintf("x%wd", n)` round-trip plus a few `WORD_MIN/MAX` cases and a mixed-`%w` format) and is registered in `src/test/main.c`.

Verified by reverting only the `io_vsnprintf.c`/`io_vsprintf.c` changes: the new test reports `flint_sprintf("x%wd", 1) gave "x" expected "x1"` on i386, then passes once the fix is restored. Full `make check` passes on i386, amd64, and armhf (under qemu-arm-static).

Closes flintlib#2646.

Caught by the MinGW64 (LLP64) CI run on the previous commit (141266a): slong on Windows is `long long` (64-bit) but `long` is 32-bit, so the expected-value computation `snprintf(expected, ..., "%ld", (long) values[ix])` truncated the extreme values (WORD_MIN became 0) and the test reported a false failure "flint_sprintf(\"[%wd]\", -9223372036854775808) gave ... expected \"[0]\"". Cast to `long long` and use `%lld` instead, which fits slong on every supported platform (slong is `long` on LP64 and `long long` on LLP64).
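
The truncation can be reproduced on any platform with fixed-width types. This is a minimal sketch, not the code in t-io.c: int32_t stands in for a 32-bit `long` and int64_t for a 64-bit slong, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* int32_t plays the role of a 32-bit `long` (LLP64). Converting an
   out-of-range value to a signed type is implementation-defined;
   common compilers keep the low 32 bits. */
static int32_t narrow_like_llp64_long(int64_t v)
{
    return (int32_t) v;
}

/* Build the expected string through long long, which is at least
   64 bits wide on every conforming implementation. */
static int format_expected(char *dst, size_t n, int64_t v)
{
    assert(dst != NULL && n > 0);
    return snprintf(dst, n, "[%lld]", (long long) v);
}
```

On common compilers narrow_like_llp64_long(INT64_MIN) yields 0, which matches the bogus expected value "[0]" in the false failure, while format_expected keeps all 64 bits.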

albinahlback · 2026-04-27T20:36:48Z

Oh nice, and nice that you added a test file as well! I can confirm that changing INT_MAX to something reasonable fixes it on cfarm26

d-torrance · 2026-04-28T00:47:08Z

This fixed the tests from #2646 on the Debian armhf porterbox!

However, I'm getting a new test failure I didn't see earlier. Could this be related to the changes?

gr_poly_log_series...
FAIL

Ring of 3 x 3 matrices over Rational field (fmpq)
n = 5
a = [[1, 0, 0],
[0, 1, 0],
[0, 0, 1]] + [[0, 0, -508],
[0, -406/3, 0],
[0, -13/6, 1/46]]*x^2 + [[0, 0, 0],
[0, 1, -8],
[-1/9, 0, -1]]*x^6 + [[0, 0, 0],
[-1000, 0, 0],
[0, 0, 0]]*x^7 + [[0, 0, 0],
[-1/460, 0, 0],
[0, 0, 0]]*x^9

b = [[1, 0, 0],
[0, 1, 0],
[0, 0, 1]] + [[0, 0, -2],
[1/89, -4797924355589564652257283/2, 0],
[0, 0, 1/512]]*x + [[0, -2/519, 0],
[-35740566643349127160, 59, 1],
[0, -1/32, 0]]*x^5 + [[4228641788/1751, 0, 0],
[-1, 0, 0],
[0, -1, 0]]*x^6 + [[0, -117/4, 0],
[0, -309485000602476631497375743/2166527304234035068851421184, 0],
[0, 51555860481/1051, 2]]*x^8 + [[-1, 0, 0],
[0, -16/143, 0],
[0, 936/553, 1]]*x^9

fa = [[0, 0, -508],
[0, -406/3, 0],
[0, -13/6, 1/46]]*x^2 + [[0, -1651/3, 127/23],
[0, -82418/9, 0],
[0, -242749/1656, -1/4232]]*x^4

fb = [[0, 0, -2],
[1/89, -4797924355589564652257283/2, 0],
[0, 0, 1/512]]*x + [[0, 0, 1/512],
[4797924355589564652257283/356, -23020078121959539233172234142846181028787226542089/8, 1/89],
[0, 0, -1/524288]]*x^2 + [[0, 0, -1/393216],
[7673359373986513077724078047615393676262408847363/356, -36816197829641385988107883389753964062591689754643694289852109548152094729/8, 1228268635030928550977864447/68352],
[0, 0, 1/402653184]]*x^3 + [[0, 0, 1/268435456],
[110448593488924157964323650169261892187775069263931082869556328644456284187/2848, -529923996741120226857499244772841880262055948334272200313980355417026060719203236203553404088483921/64, 1508643839800740363185175535557298684871671127684480257/46661632],
[0, 0, -1/274877906944]]*x^4

ab = [[1, 0, 0],
[0, 1, 0],
[0, 0, 1]] + [[0, 0, -2],
[1/89, -4797924355589564652257283/2, 0],
[0, 0, 1/512]]*x + [[0, 0, -508],
[0, -406/3, 0],
[0, -13/6, 1/46]]*x^2 + [[0, 0, -127/128],
[-406/267, 324659548061560541469409483, 0],
[-13/534, 20791005540888113493114893/4, 1/23552]]*x^3

fafb = [[0, 0, -2],
[1/89, -4797924355589564652257283/2, 0],
[0, 0, 1/512]]*x + [[0, 0, -260095/512],
[4797924355589564652257283/356, -69060234365878617699516702428538543086361679629515/24, 1/89],
[0, -13/6, 262121/12058624]]*x^2 + [[0, 0, -1/393216],
[7673359373986513077724078047615393676262408847363/356, -36816197829641385988107883389753964062591689754643694289852109548152094729/8, 1228268635030928550977864447/68352],
[0, 0, 1/402653184]]*x^3 + [[0, -1651/3, 34091302935/6174015488],
[110448593488924157964323650169261892187775069263931082869556328644456284187/2848, -4769315970670082041717493202955576922358503535008449802825823198753234546472829125831980636801630041/576, 1508643839800740363185175535557298684871671127684480257/46661632],
[0, -242749/1656, -34359738897/145410412773376]]*x^4

fab = [[0, 0, -2],
[1/89, -4797924355589564652257283/2, 0],
[0, 0, 1/512]]*x + [[0, 0, -260095/512],
[4797924355589564652257283/356, -69060234365878617699516702428538543086361679629515/24, 1/89],
[0, -13/6, 262121/12058624]]*x^2 + [[0, -26/9, -5720087/9043968],
[69060234365878617699516702428538543086361679623019/3204, -36816197829641385988107883389753964062591689754643694289852109548152094729/8, 1228268635030928550978124543/68352],
[-13/801, 15967492255402071162712237837/4608, 1/402653184]]*x^3 + [[-13/534, 5322497418467357054236849071/1024, 34091302935/6174015488],
[110448593488924157964323650169261892187775069258736530100571359980945732459/2848, -424469121389637301712856895063046346089906814615752032451498264689037874636081792199046276675345072401/51264, 34698808315417028353266385076272500853073524125272507159/1073217536],
[13/546816, -1101756965622742910258962007681/217055232, -34359738897/145410412773376]]*x^4

make: *** [Makefile:792: build/gr_poly/test/main_TEST_RUN] Aborted
make: *** Waiting for unfinished jobs....

edgarcosta · 2026-04-28T01:20:48Z

@d-torrance Thank you! I will investigate. Could you specify on what kind of machine you are observing this?

The pre-existing failure was likely just hidden earlier because something further up the test list aborted first.

d-torrance · 2026-04-28T01:31:10Z

This is on amdahl.debian.org, one of Debian's ARM porterboxes. The machine itself is 64-bit, but the tests were run in a 32-bit environment using schroot.

edgarcosta · 2026-04-28T01:58:07Z

Here is my investigation:

The gr_poly_log_series failure is a pre-existing ARM-specific bug:

On i386 native (-m32), with my fix: gr_poly_log_series passes deterministically.
On armhf under qemu-arm-static, with my fix: gr_poly_log_series FAILs (SIGABRT, exit 134).
On armhf under qemu-arm-static, with my fix reverted to current upstream/main: gr_poly_log_series FAILs byte-identically (diff -q of the two outputs reports no difference). So this PR is not the cause.
grep -rn "flint_sprintf\|flint_snprintf\|flint_vsprintf\|flint_vsnprintf" src/gr_poly/ src/gr_mat/ src/gr/ returns nothing, so none of those modules go through the code I touched. It was likely just hidden earlier because something further up the test list aborted first.

I did some quick bisection. With FLINT_BITS=32 (same RNG seed on both 32-bit archs), iters 0-54 produce identical RNG state on i386 and armhf. At iter 55 both pick GR_CTX_NF (number field) via gr_ctx_init_random. After that single call, the RNG state has diverged: i386 and armhf consumed different numbers of n_randint calls inside the same code path. The number-field path goes through fmpz_poly_randtest_irreducible (src/gr/init_random.c:157) which loops on irreducibility tests, so the divergence is most likely upstream of gr_poly_log_series entirely, in fmpz_poly_factor or fmpz_mod_poly_randtest_irreducible on 32-bit ARM. The matrix-ring failure at iter 259 is just the random ring that gets picked after the divergence accumulates.

We should certainly open another issue. I will keep investigating regardless.

albinahlback · 2026-04-28T02:18:01Z

I did get compilation warnings for truncations of integer literals when compiling on a 32-bit machine, which I believe came from new-ish code. Not sure if this may cause some issues.

fredrik-johansson · 2026-04-28T13:27:10Z

Looks good to me. Thanks!

edgarcosta added 2 commits April 27, 2026 15:26

docs(history): add 3.6.0-dev entry for flintlib#2646 fix

daa4b0d

edgarcosta mentioned this pull request Apr 28, 2026

fix(gr): arg-eval-order divergence in randtest helpers (closes #2646) #2649

Merged

4 tasks

fredrik-johansson merged commit 4a2ed06 into flintlib:main Apr 28, 2026
13 checks passed

fredrik-johansson mentioned this pull request Apr 28, 2026

Fix 32bit segfault in flint_sprintf #2647

Closed

d-torrance mentioned this pull request Apr 29, 2026

gr_nmod_redc fails on ppc64el #2651

Closed

fredrik-johansson merged 3 commits into flintlib:main from edgarcosta:fix/2646-armhf-tests
